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Abstract. Elliptic partial differential equations (PDEs) with discontinuous diffusion coefficients occur in application 
domains such as diffusions through porous media, electro-magnetic field propagation on heterogeneous media, and diffusion 
processes on rough surfaces. The standard approach to numerically treating such problems using finite element methods is to 
assume that the discontinuities lie on the boundaries of the cells in the initial triangulation. However, this does not match 
applications where discontinuities occur on curves, surfaces, or manifolds, and could even be unknown beforehand. One of the 
obstacles to treating such discontinuity problems is that the usual perturbation theory for elliptic PDEs assumes bounds for 
the distortion of the coefficients in the Loo norm and this in turn requires that the discontinuities are matched exactly when 
the coefficients are approximated. We present a new approach based on distortion of the coefficients in an Lq norm with q < oo 
which therefore does not require the exact matching of the discontinuities. We then use this new distortion theory to formulate 
new adaptive finite element methods (AFEMs) for such discontinuity problems. We show that such AFEMs are optimal in the 
sense of distortion versus number of computations, and report insightful numerical results supporting our analysis. 
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1. Introduction. We consider elliptic partial differential equations of the following form 

-d\Y{AVu) ^ f, onQ ^ ^ 
u = Q, on dft. 

where is a polyhedral domain in M'^, d > 1 integer, and A = {aij)f is a d x d positive definite matrix 
of Loo(i^) functions. 

We let I • I denote the Euclidean norm on M.'^ and when w : — > M'' is a vector valued function defined 
on n then we set 

MLAn) II H IU,(n), (1.2) 

for each < p < oo. Similarly, if B is any d x d matrix, then ||i3|| denotes its spectral norm (its norm as an 
operator from £2{M.'^) to itself). If S is a matrix valued function on fl then we define the norms 

\\B\\LAn) ■■= II \\B\\ |U,(o). (1.3) 
By redefining the on a set of measure zero, we may assume that each Uij is defined everywhere on fl and 

P(x)|| < ||A|U^(o), xen. (1.4) 

As usual, we interpret in the weak sense and use the Lax-Milgram theory for existence and unique- 
ness. Accordingly, we let H^fl) be the Sobolev space of real valued functions on fl which vanish on the 
boundary of equipped with the norm 

MHl,{n} ■= l|Vw||L2(f2) (1.5) 

and we define the quadratic form 

a(u, w) := / (AVu) . Vw, u,v e H^{n). (1.6) 
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Throughout, we shall use a • 6 to denote the inner product of vectors a and b. 

To ensure uniform ellipticity, we assume that A is symmetric and uniformly positive definite a.e. on 
n. Again, without loss of generality, we can redefine A on a set of measure zero so that A{x) is uniformly 
positive definite everywhere on fl. Given a positive definite, symmetric matrix B, we denote by XminiB) its 
smallest eigenvalue and by Aniax(S) its largest eigenvalue. In the case that i? is a function of a; e il, we 
define 



and 



Amin(-B) := inf XauniBix)), 



Amax(B) SUpA,„ax(B(x)) = || A,„ax (^CO ) II 



ioo(n) = ll-S||L„(n). 



It follows that 



Ainin {A)\y\'^ < y*A{x)y < Ainax(^)|2/P, Vx e fi, ye 



Let us also note that (1.71 implies 

Amin(^)||l'|lHi(0) < < Vax(^)||l'|lHi(a)' 



(1.7) 



(1.8) 



for all V G Hq{U). That is, the energy norm induced by a(-, •) is equivalent to the Hq norm. 

Given / € H~^{fl) := Hq{U)* (the dual space of {{^{il)), the Lax-Milgram theory implies the existence 
of a unique u = Uf £ Hq (fl) such that 



i{u,v)^{f,v), veH^in), 



(1.9) 



where (/, v) is the H ^ — Hq dual pairing. 



Practical numerical algorithms for solving (1.9), i.e. finding an approximation to u in HQ{il) to any 



prescribed accuracy e, begin by approximating / by an / and A by an A; this is the case, for example, 
when quadrature rules are applied. To analyze the performance of such an algorithm therefore requires an 
estimate for the effect of such a replacement. The usual form of such a perturbation result is the following 
(see e.g. [E]). Suppose that both A, A are symmetric, positive definite and satisfy 



^ — Aniin 

(A), A,„ax(A) < M, f < A„ii„(A), Aniax(A) < M, 

for some Q < r < M < oo and < f < Af < oo. Then, 



(1.10) 



Ik - "llifi(o) < ( 11/ - /IIh-i(o) + r-'\\A - A\\L^^n)\\f\\H-Hn) 



(1.11) 



where u G Hq{^1) i s the solution of (1.91 with diffusion matrix A and right hand side /. If A has discon- 
tinuities, then for (1.11 1 to be useful, the approximation A would have to match these discontinuities in 



order for the right side to be small. In many applications, the discontinuities of A are either unknown or lie 
along curves and surfaces which cannot be captured exactly. This precludes the direct use of ( |1.11| in the 
construction and analysis of numerical methods for ( 1.1 ). 



The first goal of the present paper is to describe a perturbation theory, given in Theorem |2.1| of §2.1 
which replaces ( |1.11[ ) by the bound 



< r 



11/ - fU-Hn) + II Vu|U^(o)||A - A\\L^(^n) , q ■= 



2p 



(1.12) 



provided Vu € Lp{fl) for some p > 2. Notice that whe n p = 2 this estimate is of the same form as ( |1.11[ ) 



because UVuH^^/s-;) < r ||/||h-i(si)- The advantage of (1.12) over (1.11) is that we do not have to match 



the discontinuities of A exactly for the right side to be small. Note however that we still require bounds on 
the eigenvalues of A and in particular A G L°°(fJ). 



However, estimate (1.12 1 exhibits an asymmetry in the dependency of the eigenvalues of A and A and 



requires additional assumptions on the right side / to guarantee that Vit € Lp{fl). This issue is discussed 
in §2.2[ It turns out that there is a range of p > 2, depending only on and the constants r,M such that 
/ e W~^{Lp{n)) implies Vw G Lp(fl) and so the estimate (1.12 1 can be applied for such /. The restriction 



that f G W ^{Lp{n)), for some p > 2, is quite mild and is met by all applications that we envisage. 

The second goal of this paper, is to develop an adaptive finite element method (AFEM) applicable to 



( 1.9 ) when A possesses discontinuities not aligned with the meshes and thus not resolved by the finite element 
approximation in L^o ■ We develop such adaptive methods based on newest vertex bisection in ^ and prove 
that our method has a certain optimality in terms of rates of convergence. We note that it is convenient to 
restrict our discussion to newest vertex subdivision and the case d = 2 for notational reasons. However, all 
of our results hold for more general d>2 and other refinement procedures such as those discussed in [7] . 

The adaptive algorithm that we propose and analyze is based on three subroutines RHS, COEFF, and 
PDE. The first of these gives an approximation to / using piecewise polynomials. This type of approximation 
of / is quite standard in AFEMs. The subroutine COEFF produces an approximation A to yl in Lg. We 
need, however, that A is uniformly positive definite with bounds on the eigenvalues of A comparable to the 
bounds assumed on A, a restriction that seems on the surface to be in conflict with approximation in Lg. 
We show in ^ that on a theoretical level the restriction of positive definiteness of A does not effect the 
approximation order in Lq. However, the derivation of numerically implementable algorithms which ensure 
positive definiteness and perform optimally in terms of Lg{fl) approximation is a more subtle issue because 
there is a need to clarify in what sense A is provided to us. We leave this aspect as an open area for further 
study. Finally, we denote by PDE the standard AFEM method [TH|, but based on the approximate right 
hand side / and diffusion coefficient A provided by RHS and COEFF. 

We end this paper by providing an insightful numerical illustration on the performance of the new 



algorithm along with the key fact that (1.12) can be applied locally. 



2. Perturbation Argument. In this section, we prove a perturbation theorem which allows for the 
approximation of A to take place in a norm weaker than Loo • As we shall see, this in turn requires Vu G Lp{fl) 
for some p > 2. Validity of such bounds is discussed in §2.2| 

The Perturbation Theorem. Let A,Ae [Loo(i^)]''^'^ be symmetric, positive definite matrices 



2.1. 



satisfying (1.10 1, for some r, f > and some M,M < oo, and let f,fEH ^{^). Let u,u E ^^o(i^) be the 
solution of (1.9) and of the perturbed problem 



(2.1) 



We now prove that the map yl i— > u is Lipschitz continuous from Lg(i7) to 11^(^1). This map is shown to be 
continuous in [13, §8, Theorem 3.1]. 

Theorem 2.1 (Perturbation Theorem). For any p > 2, the functions u and u satisfy 



\u~u\\Hl{n)<f ^\\f - f\\H-^(n)+f ^||Vu||lp(j7)||A- i||i^(o), q 



2p 
p~2 



e[2, 



(2.2) 



provided Vu G Lp(r2). 

Proof. Let u be the solution to ( 1.1 ) with diffusion matrix A and right side /. Then, from the perturbation 
estimate ( |1.11[ ), we have 



We are therefore left with bounding ||u — u\\fji(^^y From the definition of u and u, we have 

{AVu) • Vw = / (iVu) • V?;, 



(2.3) 



for all V e i?o(ri). This gives 



[iV(w -u)]-Wv= / [{A - A)Wu] ■ Vw 



for all V e Hq (f2) . Taking v = u — u, we obtain 

[iV(w - u)] • V(u - u) = / [{A - A)\/u] ■\/{u-u)< / II A - All |Vm| |V(u - m) 



To the right integral, we apply Holder's inequality with q and q' — G [1,2] to obtain 
[Aviu~u)]-Viu-u)\<\\A-A\\L^(^n)\\ \yu\ \Viu-u)\ |U^,(n). 

We now take s = 4 G [1, 2] and apply Holder's inequality again to arrive at 



(2.4) 



/ 



Vm|«'|V("- w)l'' < l|Vu||^ . .,o^llV(^t-^2^l'' 



L2(n)- 



If we inject this last inequality into (2.4) and use the coercivity estimate (1.8) with A replaced by A, then 
we deduce 

If we define p — s'q' , then p can take any value in [2, oo] and q — -3^, whence 



h~ "llffiaj) < r ^||Vu||i^(o)P- i||L,(n)- 



(2.5) 



Combining this with (2.3), we infer that 



Ik - ^llifi(o) < \\u - u\\Hi^n} + \\u - u\\H^(n) 

< r-^||V7.|U^(o)||A - i|U,(o) + f-'Wf - /Ik-Ha), 

as desired. □ 

Remark 1 (Local Perturbation Estimates). We point out that the choice of p in the perturbation 
estimate (2.2) could be different from one subdomain of ft to another. To fix ideas, assume that ft is 
decomposed into two subdomains fli and i72- Similar arguments as provided in the previous lemma yield 

\\u-u\\Hl,(n) < r-^\\f - fU-Hn) + r-^\\A~ A\\L^^(^n^)\\yu\\L^^(^n,) + r-^\\A- A\\l^^^^^^ 

where pi G [2, 00] and qt = 2pi/{pi — 2), i = 1, 2. As we shall see in ^ this turns out to be critical when the 
jump in the coefficients takes place in a subdomain fli with the solution u S W^{fli), thereby allowing to 
take Pi = CO. 



2.2. SufRcient conditions for Vu to be in i„. In order for Theorem 2.1 to be relevant we need that 



Vu is in Lp for some p > 2. It is therefore of interest to know of sufRcient conditions on A and the right side 
/ for this to be the case. In this section, we shall recall some known results in this direction. 

From the Lax-Milgram theory, we know that the solution operator boundedly maps i/^^(il) into HQ{n). 
It is natural to ask whether this mapping property extends to p > 2, that is, whether we have 
Condition p : For each f e W^^{Lp{Q,)), the solution u = Uf satisfies 



\u\w^Lp{n)) ■= ||V-u||lp(o) < Cp\\f\\w-^Lp(n)), 
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(2.6) 



with the constant Cp independent of f. 

Remark 2 (Local Condition p ) . As already noted in Remark [l] it is not necessary for the p to be 
uniform over n. In particular, one could decompose fl on subdomains on which Condition p is valid for 
different p's. This is used in ^for the numerical illustration of the method. 

When A = I (the case of Laplace's equation), the validity of Condition p is a well studied problem 
in Harmonic Analysis. It is known that for each Lipschitz domain fl, there is a P > 2 which depends on 
such that Condition p holds for all 2 < p < P (see for example Jerison and Kenig [15 J. In fact, one have 
in this setting P > 4 when d = 2 and P > 3 when d = 3. For later use when A = I, we denote by K the 
constant depending only on O and P for which 

\\yu\\L,in) < K\\f\\w-^LAn)}- (2.7) 

For more general A, Condition p can be shown to hold by using a perturbation argument given by 
Meyers [l7](see also Brenner and Scott [8]). We shall describe Meyers' result only in the case p > 2. We let 

1/2 - 1/p , , 

and note that ri{p) increases from the value zero at p = 2 to the value one at p ~ P. For any t G (0, 1), we 
define 

p*{t) argmax{is:-''(P) > 1 - i : 2 < p < P}. (2.9) 

With these definitions in hand, we have the following result for general A. Although this result is known, 
we provide the following simple proof for completeness of this section. 

Proposition 1 (Meyers |17j). Assume that f and are such that for some P > 2 and some constant 
K , the solution u G i?Q (SI) of problem ( 1.9 ) for Laplace 's equation satisfies (2.7) whenever f e W~^{Lp(fl)). 
//(I.IO) is valid for A, then the solution u G Hq{Q)) o/(1.9) satisfies 

I|Vu||lp(o) < C\\f\\w~^L^(o.)): 

provided 2<p < p*{r/M) and C := if • 

Proof. The main idea of the proof is to write A as a perturbation of the identity and deduce the L^-bound 
on Vu from the LP-bound for the solution of the Poisson problem. 

The operator T := —A is invertible from H^^iVl) to Hl{Vl), and its inverse T^^ is bounded with norm 
one. From (2.7), it is also bounded with norm as a mapping from W~^{Lp{n)) to Wq{Lp{^1)), where we 
define the norm on W^{Lp{Vl)) by its semi-norm. For the real method of interpolation, we have for 2 < p < P, 
W^[Lp{Vt)) = [Hl{n), W^{Lp{Vt))]^i^p)^p, where r]{p) is defined in It follows by interpolation that T'^ 

is a bounded mapping from VF~^(Lp(Sl)) to W^{Lp{^)) and 

l|vr-V|u.(o) <x''(^)||/|| iv-i(Lp(n)). 

Let S : W^{Lp[ri)) — > W^^{Lp{n)) denote the operator satisfying Sv :— — div (jjAVw). For conve- 
nience, we also define the perturbation operator Q := T — S . Then, S and Q are bounded operators from 
W^{Lp{n)) to W~'^[Lp{n)) with norms 

||5||<1 and ||Q||<l-i^. 

It follows that as a mapping from WQ{Lp{n)) to Wq{Lp{^)) 

\\T-^Q\\ < ||T-i||||Q|| </f''(P)(l-^). 

Hence, S ^ T{I - P-^Q) is invertible provided ^''(^^(l - ^) < 1, that is, provided 2 < p < p*{r/M). 
Moreover, as a mapping from W^^{Lp{n.)) to W^q (Lj,(r2)) 

\\s-H< .." ^< 



1 - K^{p)[i - jy) - 1 - /s:')(p)(i - X.) 



which yields the desired bound. □ 



3. Adaptive Finite Element Methods. There is by now a considerable literature which constructs 
and analyzes AFEMs. Our new algorithm differs from those existing in the literature in the assumptions 
we make on the diffusion matrix A. Typically, it is assumed that each entry in this matrix is a piecewise 
polynomial on the initial partition To or at a minimum that it is piecewise smooth on the partition To- We 
shall treat general A (which may be discontinuous) with no assumption that the discontinuities are compatible 
with To or even known to us a priori. However, our algorithms use subroutines that appear in the standard 
AFEMs and can be seen as an extension of [13 E] where the approximation of / is discussed. Therefore, 
we shall review the existing algorithms in this section. We refer the reader to Nochetto et al. [18] for an up 
to date survey of the current theory of AFEMs for elliptic problems. Unless noted otherwise, the proofs of 
all the results quoted here can be found in [18j . 

3.1. Partitions and Finite Element Spaces. Underlying any AFEM is a method for adaptively 
partitioning the domain into polyhedral cells. Since there are, by now, several papers which give a complete 
presentation of refinement rules used in AFEMs, for example [7], we assume the reader is familiar with these 
methods of partitioning. In the discussion that follows, we will consider the two dimensional case (triangles) 
and the method of newest vertex bisection, but the results we present hold for d > 2 and more general 
refinement rules satisfying Conditions 3, 4 and 6 in [Tj- In particular, they hold for successive bisections, 
quad-refinement, and red-refinement all with hanging nodes. It is simply for notational convenience that we 
limit our discussion to newest vertex bisection. 

The starting point for newest vertex partitioning is to assume that is a polygonal domain and To is 
an initial partition of fl into a finite number of triangles each with a newest vertex label. It is assumed that 
the initial labeling of vertices of To is compatible; see [H HO] ■ If a cell is to be refined, it is divided into two 
cells by bisecting the edge opposite to the newest vertex and labeling the newly created vertex for the two 
children cells. This bisection rule gives a unique refinement procedure and an ensuing forest T emanating 
from the root To- 

We say a partition T G T is admissible if it can be obtained from To by a finite number of newest 
vertex bisections. The complexity of T can be measured by the number rt(T) of bisections that need to be 
performed to obtain T from To: in fact, #T = #To + n{T). We denote by T„, n > 1, the set of all partitions 
T that can be obtained from To by n newest vertex bisections. 

A general the triangulation T S may be non-conforming, i.e., contain hanging nodes. If T is non- 
conforming, then it is known [21 120[ [7j that it can be refined to a conforming partition T by applying a 
number of newest vertex bisections controlled by ?i(T), namely, 

#T-#%<Con{T), (3.1) 

with Co an absolute constant depending only on the initial partition To and its labeling. We denote by 

CONF(r) 

the smallest conforming admissible partition which contains T. 

Given a conforming partition T G T„ and a polynomial degree m > 1, we define V(T) to be the 
finite element space of continuous piecewise polynomials of degree at most m subordinate to T. Given 
a positive definite diffusion matrix A E Loo(f^), and a right side / € L2{fl), the Galerkin approximation 



U :— U{T, A, /) := GAL(T, A, f) of ( 1.9 1 is by definition the unique solution of the discrete problem 



C/eV(r): J{A\7U)-VV = J f V, G V(r). (3.2) 

n n 

Notice that given T, the function U is the best approximation to u from V(T) in the energy norm 
induced by A which is in turn equivalent to the HqITI) norm. 

3.2. The structure of AFEM. Standard AFEMs for approximating u generate a sequence of nested 
admissible, conforming partitions {Tfc}fc>o of il starting from To. The partition TI-+1 is obtained from Tfe, k > 
0, by using an adaptive strategy. By examining residuals, certain cells in Tk are chosen for refinement (this 
choice of cells is called marking). After performing these refinements (and possibly additional refinements 
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to remove hanging nodes), we obtain the partition Tk+i, and the new Galerkin solution Uk+i- We denote 
this procedure by PDE and formally write 



[Tk+i, Uk+i] = PDE(rfc, ^, /, ek), \\u - C/fc+i|Ui(n) < £fc. 

In other words the input to PDE is the partition 71-, the matrix A, the right side / and the target error 
ek- The output is the partition Tk+i and the new Galerkin solution Uk+i which satisfies the desired error. 



This idealized algorithm does not carefully handle the error incurred in the formulation and solution of (3.2 1, 
namely in the procedure GAL(T, A, /) [TB]. This step requires the computation of integrals that are products 
of / or A with functions from the finite element space. In performance analysis of such algorithms, it is 
typically assumed that these integrals are computed exactly, while in fact they are computed by quadrature 
rules. The effect of quadrature is not assessed in a pure a posteriori context. One alternative, advocated 
in [21 US] for the Laplace operator, is to approximate / by a suitable piecewise polynomial fk over Tk- Of 
course, one still needs to understand in what sense / and A are given to us, a critical issue not addressed 
here. 

Our AFEM differs from PDE in that we use approximations to both / and A, the latter being crucial 
to the method. Our view of the main iteration of AFEM is as follows. We are given a current partition Tk 
and a target error Ck for the -ffo(f^) error between u and the next Galerkin approximation Uk+i G V(7a;+i). 
The AFEM will first find an admissible conforming partition 7^' , which is a refinement of Tk, on which we 
can approximate / by a piecewise polynomial fk and likewise A by a piecewise polynomial Ak such that 

11/ - fk\\H-Hn) < 4, 11^ - ML^m < 4, (3.3) 

with q = 2p/{p — 2) G [2,oo] (the existing algorithms in the literature always take q = oo [Hj)- The 
tolerance ej. is chosen as a multiple of e^, for example ej, = we^ with w > yet to be determined. We next 
apply [Tk+i; Uk+i] ~ PDE(7fc', Ak, fk, efc/2) to find the new admissible conforming partition Tk+i, which is a 
refinement of 7^', and so of Tfe, and Galerkin solution Uk+i satisfying 

\\uk - Uk+i\\H^{n) < 



where Uk is the solution to ( 1.9 1 with diffusion matrix Ak and right side fk- ^From the perturbation estimate 



(2.2), with f > a bound for the minimum eigenvalue of Ak, we obtain 

||u- t/fc+i||_f/i(n) < II" - WfellHi(ji) + \\uk - t^fe+i||Hi(f2) 

< ^^"'11/ - MIh-i(o) + r-i||Vu|U^(o)P - ^fc||L,(o) + 



Therefore, invoking (2.61 and choosing e'j, (or lo) sufficiently small, we get the desired bound 



h-Uk+i\\Hiin)<r-'{l + Cp\\f\\w-^(L,n)))e'k + j<ek- (3.4) 

We see that such an AFEM has three basic subroutines. At iteration k, the first one is an algorithm 
RHS which provides the approximation fk to /, the second one is an algorithm COEFF which provides the 
approximation Ak to A, and the third one is PDE which does the marking and further refinement Tk+i to 
drive down the error of the Galerkin approximation Uk+i to u. We discuss each of these in somewhat more 
detail now. 

We denote by RHS the algorithm which generates the approximation to /. It takes as input a function 
/ G H~^(fl), a conforming partition T, and a tolerance e. The algorithm then outputs 

[t,/] = RHS(/,r,e) 

where T is a conforming partition which is a refinement of T and / is a piecewise polynomial of degree at 
most nif subordinate to T such that 

||/-/||h-i(0) <e. (3.5) 
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Notice that we do not assume any regularity for /. In theory, one could construct such RHS, but in practice 
one needs more information on / to realize such algorithm as we now discuss. 

By far, the majority of AFEMs assume that / g L2{^) but recent work [191 [11] treats the case of certain 
more general right sides / € H^^{n). If f E L2(ri), then one can bound the error in approximating / by 
piecewise polynomials of degree at most by 

\\f'f\\H-Hn)<oscif,T) [Y. hUf - aTiml^T))\ (3-6) 



TeT 



where axif) is the L^(T) orthogonal projection of / onto Pm^._i(T), the space of polynomials of total degree 



< uif — 1 over r, and /|t := cltH) for T eT- T. The right side of (3.6) is called the oscillation of / on T. 
For any concrete realization of RHS, one needs a model for what information is available about /. We refer 
to [11] for further discussions in this direction. 

Similarly, one needs to approximate A in the AFEM. Given a positive definite and bounded diffusion 
matrix A, a conforming partition T and a tolerance e, the procedure 

[f,A] = COEFF(A,r,e) 

outputs a conforming partition T, which is a refinement of T, and a diffusion matrix A, which has piecewise 
polynomial components of degree at most mA subordinate to T and satisfies 

P-i||L,(0) <e, (3.7) 

for q = 2p/{p — 2) G [2, oo]. In addition, in order to guarantee the positive definiteness of A, we require that 
there is a known constant C2 for which we have 

(^)< Tin ax 

(A) < C27max(^)- (3.8) 

In ^we discuss constructions of COEFF (and briefly mention constructions for RHS) that have the above 
properties and in addition are optimal in the sense of §3.3[ 

If A is a piecewise polynomial matrix of degree < on the initial partition To, then one would have 
an exact representation of A as a polynomial on each cell T of any partition T and there is no need to 
approximate A. For more general A, the standard approach is to approximate A in the Lao{^) norm by 
piecewise polynomials. This requires that A is piecewise smooth on the initial partition To in order to 
guarantee that this Loo error can be made arbitrarily small; thus q = 00. The new perturbation theory we 
have given allows one to circumvent this restrictive assumption on A required by standard AFEM. Namely, 
it is enough to assume that Condition p holds for some p > 2 since then q < 00. 

3.3. Measuring the performance of AFEM. The ultimate goal of an AFEM is to produce a quasi- 
best approximation U to u with error measured in || • ||//i(si)- The performance of the AFEM is measured 
by the size oi \\u — U\\H^(n) relative to the size of the partition T. The size of T usually reflects the total 
computational cost of implementing the algorithm. As a benchmark, it is useful to compare the performance 
of the AFEM with the best approximation of u, f and A provided we have full knowledge of them. 

Approximating u. For each n > 1, we define S„ to be the union of all the V(T) C -ffo (^) with T G In (the 
set of non-conforming partitions obtained from To by at most n refinements) . Notice that S„ is a nonlinear 
class of functions. Given any function v £ Hq{Q,), we denote by 

(Jn{v) := <Tn{v)Hl,{n) — ^mf \\v - n > 1, 

the error of the best approximation of v by elements of I]„. Using cr„, we can stratify the space Hq{Q,) into 
approximation classes: for any s > 0, we define A"^ A^iTo, iJg (fi)) as the set of all v £ ^o (i^) for which 

\v\a- ■■= sup (nV„(w)^i(o)) < 00; (3.9) 

n>l ^ ^ 
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the quantity is a quasi-norni. Notice that as s increases the cost of membership to be in A'^ increases. 



For example, we have A'^^ C A'^'^ whenever S2 < si- One can only expect (3.91 for a certain range of s, 
namely < s < where S — m/d is the natural bound on the order of approximation imposed by the 
polynomial degree m being used. 

Since the output of AFEMs are conforming partitions, it is important to understand whether the im- 
position that the partitions arc conforming has any serious effect on the approximation classes. In view of 



(3.1), we have that for any v e A^, there exist Sn G 5]„ subordinate to a conforming partition for which 

\\v-Sn\\Hiin)<Q\v\A^n-', n>l, (3.10) 



with Co the constant in (3.1). Thus, if we had defined the approximation classes A'^ with the additional 
requirement that the underlying partitions are conforming, then we would get the same approximation class 
and an equivalent quasi-norm. 

As mentioned above, the input to routine PDE includes polynomial approximations / and A to f and 



A and then the algorithm produces an approximation to the solution u of (1.91 with diffusion coefficient A 
and right hand side /. However, u ^ A'' does not guarantee that u e A'^ , which motivates us to introduce 
the following definition. 

Definition 3.1 (e— approximation of order s). Given u e A"^ and e > 0, a function v is said to be an 
e— approximation of order s to u if \\u — ^ ^ o,"^^ there exists a constant C independent of e, u, and 

V, such that for all 5 > e there exists n G N with 

'Tniv)Him<5, n<C\u\]l:S-'/\ (3.11) 



We remark that if ei < £2 and v is an ei— approximation of order s to u, then v is an £2— approximation of 
order s to w as well. We now provide a lemma characterizing such functions. 

Lemma 3.2 (e— approximations of order s). Let u E A''{To, i/p (il)) and v G i?o (il) satisfy t;||^i(-Q) < 
e for some e > 0. Then v is a 2e- approximation of order s to u. 

Proof. Let 5 > 2e. It suffices to invoke a triangle inequality to realize that 



C'n(w)Hi(n) < \\u - -"11^1(0)) + crn(w)Hi(0) < '5/2 + Cr„ (m)^! (fl) • 



(3.12) 



Since u e A'^ {To, Hg{n)) we deduce that there exists n < 
Estimate (|3.11|) thus follows with C = 2^/'. □ 



\u\]l!iS/2)-^/' such that fT„(u)^i(o) < S/2. 



In view of this discussion, we make the assumption that the call [T,U] — PDE(T, A, /,e) deals with 
approximate data A and / exactly and creates no further errors. We say that PDE is of class optimal 
performance in A^ if there is an absolute constant C3 such that the number of elements N(u) marked for 
refinement on T to achieve the error \\u - 



U\\H^(n) < e satisfies 



(3.13) 



whenever the solution -u to ( 1.9 1 with data A and / is in ^"(To, ^0 (^^))- This is a slight abuse of terminology 
because this algorithm is just near class optimal due to the presence of C3. We drop the word 'near' in what 
follows for this and other algorithms. Moreover, notice that we distinguish between the elements selected for 



refinement by the algorithm and those chosen to ensure conforming meshes. Estimate (3.13) only concerns 



the former since the latter may not satisfy (3.13) in general; see for instance [21 118j. 



Approximating /. We can measure the performance of the approximation of / in a similar way. We let 
S™ be the space of all piecewise polynomials S of degree at most m > subordinate to a partition T G Tn, 
and then define 



ct«(/)h-i(o) :=cr"(/)if- 



:= inf 



\f-S\\H- 



n > 1. 
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In analogy with the class A^, we let the class :— /B*(7o, H ^{^)), s > 0, consist of all functions f E H -'^(fl) 
for which 

|/|e= := sup (nV„(/)ff-i(n)) < oo. (3.14) 

n>l ^ ^ 

We will also need to consider the approximation of functions in other norms. If < g < oo and g G Lq{fl) 
{g £ C(r2) in the case q — oo), we define 

o-n(/)L,(n) := <"(/)l,(J2) := ^inf^ 11/ - -SlU^Cn), n>l, 

and the corresponding approximation classes B'^{Lq{il)) := B'' {To, Lq{il.)), s > 0, consisting of all fmictions 
Lg[fl) for which 

l/l8=(L,(n)) := sup (nV„(/)i^(n)) < oo. (3.15) 

n>l ^ ^ 

Let us now see what performance we can expect of the algorithm RHS. If / G B'^{To,H^^{n)), then 
there are partitions T* with #T* — #7o < l/le'''*^"^^'' on which we can find a piecewise polynomial S* such 
that 11/ — S\\H-i-(n) ^ £• Given any T obtained as a refinement of To, the overlay T* © T of T* with T has 
cardinality obeying [18] 

#(r ® T) - #r < #r* - #ro < |/lB('e-l/^ (3.16) 

Notice that at this stage the partitions might not be conforming. This motivates us to say that the algorithm 
RHS has class optimal performance on if the number N{f) of elements chosen to be refined by the 
algorithm to achieve a tolerance e starting from T, always satisfies 

N{f) < Cslfl'Jfe-^ (3.17) 
with C3 an absolute constant. We refer to ^for the construction of such algorithms. 

Approximating A. With slight abuse of notation, we denote again by S™ the class of piecewise polynomial 
matrices of degree < m subordinate to a partition T E In- The best approximation error of A within E™ 
is given by 

We denote by := M'^ {%, Lq{Q j) the class of all matrices such that 

\A\m^ ■■= sup (n'(Jn{A)LJn)] < oo. (3.18) 

n>l ^ ' 

This accounts for Lq approximability. But in our application of the algorithm COEFF, we need that the 



matrix A is also positive definite to make use of the perturbation estimate (2.2 1. We show later in 5 5.2.1 



that if we know the bounds (1.10 1 for the eigenvalues of A, then there is a constant C4 and a piecewise 
polynomial matrix A of degree < rti such that 

P-i||L,(^2) <C4fTn(^)L,(0) (3.19) 



where the eigenvalues of A satisfy ( 1.10 ) for some f and M comparable to r and M respectively. The constant 
C4 is independent of r, M and n. 

In analogy to PDE and RHS, we say that the algorithm COEFF has class optimal performance on A^** if 
the number N{A) of elements marked for refinement to achieve the tolerance e starting from a partition T 
always satisfies 

N{A) < CMl'/^U-^, (3.20) 
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with C3 an absolute constant. Again, we refer to ^for the construction of such algorithms. 

As with the approximation class A" earlier, we get exactly the same approximation classes = 
B''{To, H~^{n)) and Al" = A4''{To, Lq{n)) if we require in addition that the partitions are conforming. 

Performance of AFEM. Given the approximation classes A" , B'^ , , a goal for performance of an AFEM 
would be that whenever u G -4",/ G B'',A G Al*, the AFEM produces a sequence {7fe}A;>o of nested 
triangulations (Jk+i > Tk for each fc > 0) such that for A; > 1 



\\u-Uk\\H, 



< 



C{\ 



\f\B' + \A\M')im-Woy 



(3.21) 



with C a constant depending only on s. The bound ( 3.21[ ) is in the spirit of Binev et al [2], Stevenson |19| . 
and Gascon et al [9l in that the regularity of the triple {u, /, A) enters. It was recently shown in [TP that in 
the case A = a I, where a is a piecewise constant function and / the identity matrix, u G A'' implies / G S^, 
provided s < S. No such a result exists for A, which exhibits a multiplicative rather than additive structure. 

3.4. e — approximation and class optimal performance. As already noted in |3.3[ the context 
on which we invoke PDE is unusual in the sense that the diffusion coefficient and the right hand sides may 



change between iterations. Therefore, to justify (3.13) in our current setting, we will need some observations 



about how it is proved for instance in [21 |20j |9] ; see also [18] 



Let ii G H^{fl) be the solution of (1.9 1 with data A and /. Let U = GAL{T,AJ) G W{T) be the 



Galerkin solution subordinate to the input subdivision T and let e := 



U\\h, 



be the error achieved 



by the Galerkin solution on T- The control on how many cells are selected by PDE is done by comparing 
with the smallest partition T* which achieves accuracy /ie for some < /i < 1 depending on the Dorfler 
marking parameter and the scaling constants in the upper and lower a posteriori error estimates. That is, if 
A4 denotes the set of selected (marked) cells for refinement, one compares with if{T* ® T) — #7o to 
obtain 



-1/s 



(3.22) 



whenever u G A^ and where C3 is an absolute constant (depending on s). 

First, it is important to realize that the argument leading to (3.22) does not require the full regularity 

^331 



u G A'^ but only that u is an /xe-approximation of order s to some v G A'^; see 

Second, we note that several sub-iterations within PDE(T, A, /, e) might be required to achieve the 



tolerance e. However, each sub-iteration selects a number of cells satisfying (3.22) and the number of sub- 



iterations is proportional to e/e; see [6] for more details. Our new AFEM algorithm will keep this ratio 
bounded thereby ensuring that the number of sub-iterations within PDE remains uniformly bounded. 

In conclusion, combining Lemma 3.2 with (3.22), we realize that N = N(u), the number of cell marked 



for refinement by PDE(T, A, /, e) to achieve the desired tolerance e, satisfies 



N < CM 



1/s -1/s 



(3.23) 



provided that u is an /ie-approximation of order s to w G yl*(7o, i/g (Jl)) and that each call of PDE corresponds 
to a ratio e/e uniformly bounded. This proportionality constant is absorbed into C3. We will use this fact 
in the analysis of our new AFEM algorithm. 

4. AFEM for Discontinuous Diffusion Matrices: DISC. We are now in the position to formulate 
our new AFEM which will be denoted by DISC. It will consist of three main modules RHS, COEFF and PDE. 
While the algorithms RHS and PDE are standard, we recall that COEFF requires an approximation of A in 
Lq{n) instead of Loo(f^) for some q = 2p/{p — 2) where p is such that Condition p holds. 

We assume that the three algorithms RHS, COEFF, and PDE are known to be class optimal for all < 
s < S with S > 0. The algorithm DISC inputs an initial conforming subdivision To an initial tolerance ei, the 
matrix A and the right side / for which we know the solution u of ( 1.9 ) satisfies || VwHi^fn) < C'p||/||vi/- 



see Condition p in §2.2[ We now fix constants < w, /3 < 1 such that 



w < 



4C2(1 + Cp||/| 



PiiJ iiw-HLp{n))) 
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Hip)' 



(4.1) 



where < 1 is the constant of S3.4 and C2 appears in the uniform bound (3.81 on the eigenvalues. 
Given an initial mesh To and parameters eo,w,/3, the algorithm DISC sets fc and iterates: 

[Tk{f)Jk] = R^S{Tk,f,ujek) 
[TkiA),Ak] - COEFF{Tk{f),A,Ljek) 
[Tk+i,Uk+i] = PDE{Tk{A),AkJk,ek/2) 
efc+i = f3ek; k ^ k + 1. 

The following theorem shows the optimality of DISC. 

Theorem 4.1 (Optimality of DISC). Assume that the right side f is in B'^^ {H^^{fl)) with < Sf < S, 
that Condition p holds for some p > 2 and that the diffusion matrix A is positive definite, in Lao {^) o,nd 
in ^A'^^{Lq{^^)) for q := and < sa < S . Let To be the initial subdivision and Uk G V(7fe) be the 
Galerkin solution obtained at the kth iteration of the algorithm DISC. Then, whenever u £ A'^^'{H^{Vt)) for 
Q < Su < S , we have for k> 1 



II" - Uk\\Hl(n) < efc- 



and 



f^Tk — #7o < C4 



,1/s 



l^l7M = (L,(n)) 



l/le'=(_f/-i(n)) 



(4.2) 



(4.3) 



where C 4^ :^ Yzq^rfi and s ^ vtun{suT sa, s j) . 

Proof. Let us first prove (4.2). We denote by Uk the solution to (1.9) for the diffusion matrix A^ and 
right side fk. From (2.2) and since Condition p holds, we have 

II" - Wfcllifi(o) < f-'^Wf - fkWn-Hn) + f^^W^Ah^mWA - Ak\\L,{n)- 



In addition, the restriction (4.1 ) on uj and the bound (3.8) on the eigenvalues of Ak lead to 



II" - "fell//i(o) < (1 + Cp\\f\\w-i(L^))i^^k 



< 



(4.4) 



Since Uk+i satisfies ||wfc — C^fc+i||ffi(n) ^ £fe/2 and ^ < 1, the triangle inequality yields ( |4.2[ ) 



3ek 



\\u-Uk+i\\Hl,{n) < ^ < ^k, 



\/k > 0. 



Next, we prove (4.3). At each step j of the algorithm, the new partition Tj is generated from Tj-i by 
selecting cells for refinement and possibly others to ensure the conformity of Tj. We denote by Nj{f), Nj{A), 
and Nj the number of cells selected for refinement by the routines RHS, COEFF, and PDE respectively. The 
bound (3.1) accounts for the extra refinements to create conforming subdivisions, namely 

fc-i 

#rk - #ro <CoJ2 if) + (^) + 

The class optimality assumptions of RHS and COEFF directly imply that 



N,{f) < C,\f\ 



and N,iA)<C,\A\'l;.-,^^^^^^^{u;e, 



We cannot directly use that u G to bound Nj, because Nj is dictated by the sparsity of Uj, the solution 
of (1.9) with diffusion coefficient Aj and right hand side fj. Let Uj{A) = GAL{Tj{A), Aj, fj), set ij :— 
ll"j ~ Uj{A)\\ fjif^Qj, and assume that ij > ej/2 for otherwise the call PDE{Tj{A),Aj,fj,ej/2) is skipped and 
Ni = 0. 



In view of the discussion of 5 3.4 



estimate (3.23) is valid upon proving that 2ij/ej is uniformly bounded 
with respect to the iteration j and that Uj is an /xej/2-approximation (and in particular /xej-approximation) 
of order s to u. The latter is direct consequence of the estimate \\u — UjWfjii^^-^ < /iej/4 given in (4.4) and 
Lemma 3.2 so that only the uniform bound on 2ej/ej remains to be proved. 
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We recall the Galerkin projection property 



\\A]'^V{u, - C/,(A))|U,(o) < P/^VK - V)U,^n) 

holds for any V E Y{Tj{A)) and in particular for Uj E V(75) C V(7}(^)) because Tj{A) is a refinement of 
Tj. If rj — 7min(^j) and Mj — 7max(^j) denote the minimal and maximal eigenvalues of Aj, then the above 
Galerkin projection property and the triangle inequality yield 



1/2, 



= hj - C/j(A)||^i(j,) < {M,/r,y/^u, - U,\\Hi,in) 



Combining (4.2) and ej = /3ej-i, together with ( |4.4[ ) and (3.81, implies the desired bound 
i,/e, < (A/,/r,y/2(l//3 + ^/4) < C2{M/ry/'{l/P + ^i/A). 
The argument given in |3.4| guarantees the bound ( |3.23[ ), namely 

Gathering the bounds on Nj{f), Nj{A) and Nj, and using that w < 1, we deduce 



#Tk - Wo < CoC,{\A\\ 



/s 

A1 = (L,(0)) 



,1/s 



k-l 



-l/s 



The desired estimate (4.3) is obtained after writing ej — ^eu and recalling that < /3 < 1 so that 
Note that, we could as well state the conclusion of Theorem 14. 11 as 



UkWuKo.)) < C'(kU=(//i(f^)) + \ f\BHH-^(n)) + \A\M-{L,{Q.))j - #7B)" 



for a constant C independent of fc, whence DISC has optimal performance according to (3.21). 

Remark 3 (The case s < s„). We briefly discuss why the decay rate s = min(s„, s/, s^i) cannot 
be improved in (4.3) to s„ (the optimal rate for the approximation of u G A'^") by any algorithm using 
approximations A oi A and / of /. We focus on the effect of the diffusion coefficient A, assuming the right 
hand side / is exactly captured by the initial triangulation To , since a somewhat simpler argument holds for 
the approximation of /. 

The approximation of u by U, the Galerkin solution with diffusion coefficient A, cannot be better than 
that of A by A. Indeed, there are two constants c and C such that for any S > 



c5 < sup ||uAi - UA2\\min) < CS, 

Ai,A2<£B{A,S) 



(4.5) 



where ua^ € -ffg (fi), i = 1, 2, are the weak solutions of —dW{AiWuAi) = / and for 5 > 
B{A,S) :— X d positive matrices B \ \\A — -B||l,(o) < ■ 



While the right inequality is a direct consequence of the perturbation theorem (Theorem 2.1), the left 



inequality is obtained by the particular choice Ai 
implies that Ai,A2 e B{A,S) and ua^ = (l + ^ 

S 



\UAi - UA2\\H},m 



5 



11-4111,, (f! 

llwAillffi(a) 



A and A2 = [1 ^ 
UAi ■ Therefore 



ll^ll 



) ^ A. In fact, this choice 



r.1/2 



where < r < M < 00 are the lower and upper bounds for the eignevalues of A. 
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5. Algorithms RHS and COEFF. We have proven in Theorem [4~Tj the optimahty of the AFEM DISC 
provided the subroutines RHS and COEFF are themselves optimal. In this section, we discuss what is known 
about the construction of optimal algorithms for RHS and COEFF. A construction of algorithms of this type 
can be made at two levels. The first, which we shall call the theoretical level, addresses this problem by 
assuming we have complete knowledge of / or ^ and anything we need about them can be computed free 
of cost. This would be the case for example if / and A were known piecewise smooth functions on some 
fixed known partition (which could be unrelated to the initial partition) . The second level, which we call the 
practical level, realizes that in most applications of AFEMs, we do not precisely know f or A but what we 
can do, for example, is compute for any chosen query point x the value of these functions to high precision. 
It is obviously easier to construct theoretical algorithms and we shall primarily discuss this issue. 

5.1. Optimal algorithms for adaptive approximation of a function. The study and construction 
of algorithms like RHS is a central subject not only in adaptive finite element methods but also in approx- 
imation theory. These algorithms are needed in all AFEMs. The present paper is not intended to advance 
this particular subject. Rather, we want only to give an overview of what is known about such algorithms 
both at the theoretical and practical level. We begin by discussing Lq approximation. 

Two adaptive algorithms were introduced in [3] for approximating functions and were proven to be 
optimal in several settings. These algorithms are built on local error estimators. If T is a polyhedral cell, 
we define the local Lq error by 

E{T) ■.= E{g,T)^^^T) :=^^iW^^||.9~P|U,(T), (5.1) 

where Vm{T) is the space of polynomials of degree < m over T. Given a partition T, the best approxi- 
mation to 5 e Lq{^) by piecewise polynomials of degree < m is obtained by taking the best polynomial 
approximation Pt to g on T for each T € T and then 



Sr := J2 PtXt, (5.2) 



Ter 

where xt is the characteristic function of T. Its global error is 

.1/9 



£{g,T) |l5 - ^rllL,(n) = ( E ^(3' ^)l(T)) (5-3) 



So finding good approximations to g reduces to finding good partitions T with small cardinality. 

The algorithms in |?j adaptively build partitions by examining the local errors E{g, T)^ (q) for T in the 
current partition and then refining some of these cells based not only on the size of this error but also the 
past history. The procedure penalizes cells which arise from previous refinements that did not significantly 
reduce the error. The main result of |4] is that for < g < oo, these two algorithms start from To and 
construct partitions 7^, n — 1,2,..., such that Tn S 'Zen and 

£{g,Tn)<C an{g)L,{n), n = l,2,..., (5.4) 



where c > 1 and C > 1 are fixed constants. In view of (3.16), it follows that these algorithms are both 
optimal for Lq approximation, \ < q < oo, for all s > 0. 

While the above algorithms are optimal for Lq approximation, they are often replaced by the simpler 
strategy of marking and refining only the cells with largest local error. We describe and discuss one of these 
strategies known as the greedy algorithm. Given any refinement T of the initial mesh To, the procedure 
T' = T'{e) = GREEDY(T, 5, e) constructs a refinement T' of T such that £{g,T') < £■ 

To describe the algorithm, we first recall that the bisection rules of §3.1 [ define a unique forest % emanating 
from To. The elements in this forest can be given a unique lexicographic ordering. The algorithm reads: 

r(e) = GREEDY(r,ff,e) 

T' = r; 

while £{g,T') > e 
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T :=axgmaji{E{g,T) : TeV}; 

V := REFINE(r',r); 
end while 
r(e) = CONF(r') 

We see that GREEDY chooses an element T £ T with largest error E{g,T) and replaces T by its two 
children to produce the next conforming refinement T' until the error £{g,T') is below the prescribed 
tolerance e. The selection of T is done by choosing the smallest lexicographic T to break ties. The procedure 
T' = REFINE(T, T) replaces T by its two children. Upon exiting the while loop, additional refinements are 
made on T' by CONF to obtain the smallest conforming partition T(e) which contains T'. 
An important property of the local error E[T) is its monotonicity 

E{TiY ^ E{T2Y < E{TY VTer, (5.5) 

where Ti , T2 are the two children of T. This leads to a global monotonicity property 

E{gX)<^{9,T) (5.6) 

for all refinements T' of T whether conforming or not. 

While the greedy algorithm is not proven to be optimal in the sense of giving the rate 0{n^'^) for the 
entire class B''{Lq{il.)), it is known to be optimal on subclasses of B'^ . For example, it is known that any 
finite ball in the Besov space B^{Lr{i^)) with s/d > l/r—l/q and < s < m + l is contained in B''{Lq{il)). 
The following proposition shows that the greedy algorithm is optimal on these Besov balls. 

Proposition 2 (Performance of GREEDY). If g e B^{Lr{il)) with s/d > l/r-l/q andO < s < m + l, 
then GREEDY terminates in a finite number of steps and marks a total number of elements N{g) := ij=T' — fl^T 
satisfying 

iV(5)<C3|5l^''/(i^(o))£-'/^ (5.7) 
with a constant depending only onTo, \^\,t,s andq. Therefore, g G B^^'^{Tq, Lq{^iy) with \g\j3^i'i(To Lg(n}) ^ 

Results of this type have a long history beginning with the famous theorems of Birman and Somlyak 
[S] for Sobolev spaces, [3] for Besov spaces, and [10 for the analogous wavelet tree approximation; see also 
the expositions in [6l[TTl[T8]. If s/d = 1/t — 1/g, it turns out that there are functions in B^{Lr{^)) which 
are not in B'^/'^{J'q, Lq{^)) [3]. Therefore, the assumption s/d > 1/t — 1/q of Proposition is sharp in the 
Besov scale of spaces to obtain B^(L^(rj)) c B''/'^{%,Lq{9,)). We may thus say that GREEDY has near 
class optimal performance in B''^'^{To, Lq{fl)). 

Proposition[2]differs from these previous results in the marking of only one cell at each iteration. However, 
its proof follows the same reasoning as that given in [3] (see also [HI [18]) except for the following important 
point. The proofs in the literature assume that the greedy algorithm begins with the initial partition To and 
not a general partition T as stated in the proposition. This is an important distinction since our algorithm 
DISC is applied to general T. We now give a simple argument that shows that the number of elements 
N{g) = N{T,g) marked by GREEDY(T, g, e) starting from T satisfies 

N{g)<N = N{TQ,g) (5.8) 



and thus (5.7). We first recall that the bisection rules of |3.l| define a unique forest T emanating from To 
and a unique sequence of elements {Ti}^^ C 1 created by GREEDY(To, e). Let TJ^i = REFINE(T^', T^) be 
the intermediate subdivisions obtained within GREEDY(To,e) to refine Ti,l < i < N. Let A be the set of 
indices j € {1, . . . ,iV} such that Tj is never refined in the process to create T, i.e Tj is either an element 
of T or a successor of an element of T. If A = 0, then T is a refinement of T^_|_i, whence N{g) = and 
we have nothing to prove. If A 7^ 0, we let j be the smallest index in A and note that Tj G TJ_i with 
Tq = Tf). The definition of A in conjunction with the minimality of T^'_i implies that T is a refinement of 
Tj^i- Since Tj cannot be a successor of an element of T, because of the definition of Tj and the monotonicity 



property (5.5), we thus infer that Tj is an element of T. This ensures that Tj is the element with largest local 
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error (with lexicographic criteria to break ties) among the elements of T, and is thus the element selected 
by GREEDY(T, e). Therefore, GREEDY(T, e) chooses in order the elements T^, i E A, and stops when it 
exhausts A if not before, thereby leading to ( 5 



Finally, let us note that a similar analysis can be given for the construction of optimal algorithms for 
approximating the right hand side / in the H~^(fl) norm, except that H~^{rt) is not a local norm. Since 
this is reported on in detail in |TI] we do not discuss this further here. 

5.2. Optimal algorithms for COEFF. Given an integer m, we recall the space E„ :— S™ of matrix 
valued piecewise polynomial functions of degree < m, the error o'ri(^)L,(f2), and the approximation classes 
A^'*(Lg(r2)) that were introduced in S3.3 Assume that A E Loo{^) is a positive definite matrix valued 
function whose eigenvalues satisfy 



r < XminiA) < Xn,,^{A) < M. (5.9) 



This is equivalent to 



r < ^ a,j{x)z,Zj = z^A{x)z < M, \z\ = 1. (5.10) 

ij = l 

The construction of an algorithm COEFF to approximate A by elements B e has two components. The 
first one is to find good approximants B G Lq{D,) for q < oo. The second issue is to ensure that B is also 
positive definite. We study the latter in §5.2. 1| and the former in §5.2.2[ 

5.2.1. Enforcing positive definiteness. High order approximations B G of A, namely m > 0, 
may not be positive definite. We now show how to adjust B to make it positive definite without degrading 
its approximation to A. 

Proposition 3 (Enforcing positive definiteness locally) . Let A be a symmetric positive definite matrix 
valued function in Lq{n) whose eigenvalues are in [r, M] with M > 1. Let T d be any polyhedral cell 
which may arise from newest vertex bisection applied to To • Lf there is a matrix valued function B which is 
a polynomial of degree < m on T and satisfies 

P-S||l,(t) <e, 

then there is a positive definite matrix B whose entries are also polynomials of degree < m such that the 
eigenvalues of B satisfy 

r/2 < A,„i„(i3) < AmUB) < CM (5.11) 



\\A~B\\L^^T)<Ce, 

where C depends only on d,m and the initial partition To o-nd C depends additionally on M/r. 

Proof. Let us begin with an inverse (or Bernstein) inequality: there is a constant Ci, depending on m, d 
and To, such that for any polynomial P of degree < to in d variables and any polyhedron T which arises 
from newest vertex bisection, we have [S] 

\WP\\L^iT)<C,\\PU^(T)\T\-''''. (5.12) 

Now, given A and the approximation B, let 

Mq :— sup sup y*B{x)y. 

xeT\y\=l 

We first show that we can assume that Afo < CM, for a fixed constant C . We leave the value of C > 1 
open at this stage and derive restrictions on C as we proceed. If Mq > CAI, then there is a y with |?;| = 1 
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and an xo & T such that y'^B{xo)y = Mo > CM. We fix this y and consider the function a{x) := y*A{x)y 
and the polynomial P{x) :— y^B{x)y of degree m. Notice that |1^^|1loo(t) = Mq and 



(5.13) 



In view of (5.12), we have 

\P{x) - P{xo)\ < CiAfo|T|-i/'^|x - a;o| < Mo/2, \x - Xo\ < |r|i/V(2Ci). 

Let To be the set oi x £ T such that \x - xo\ < |T|i/''/(2Ci). Then P{x) > Mo/2 on Tq and hence 
\a{x) - P{x)\ > Afo/4 on Tq provided C > 4 because < a{x) < M < Mo/C. Since Tq has measure > c\T\, 
with c < 1 depending only on d and To, we obtain 



Mo(c|T|)i/« <4||a-P|U,(T) <4e. 



If we define B rl on T, with / the identity, then using (5.14) we obtain 

\\A-B\\l^^t} < (M + r)|T|i/« < 2M|T|i/« < {2Mo/C)\T\^/'^ < (Mo/4)(c|r|)i/9 < 



(5.14) 



(5.15) 



provided C is chosen large enough so that C~^c~^/^ < 1/8. This implies C > 4 and fixes the value of C. 
Thus, we have satisfied the lemma in the case Mq > CM. 

For the remainder of the proof we assume that Mo < CM. We now consider 



u, :— inf inf y*B{x)y. 



(5.16) 



If /i > r/2, we have nothing to prove. So, we assume that /i < r/2 and fix yo with |yo| = 1 and xo & T such 
that P{x) := yoB{x)yo assumes the value /i at xo- We then have from (5.12) 



\P{x) - P{xo)\ < CiMo\T\-^/'^\x -xo\< CiCAf |T|-i/'*|x - xo\, xeT. 



(5.17) 



Let To be the set of a; e T such that \x - Xo\ < r\T\^/'^/{4CiCM). Notice that |To| > cr'^C^'^C~'^M-'^\T\ 
for some constant c only depending on d and To, and \P{x) — ^'(a^o)! < ^/4. It follows that P{x) < n + r/4 
on the subset Tq- Thus, for a{x) := yQA{x)yo > r, this gives that a{x) — P{x) > ^ — fi, x £ Tq and therefore 



\To\'^'{^-f^)<\\a-P\\L,iT)< 



(5.18) 



We now define B ^ B + (|r — so that 



y'B{x)y > -r, \y\ = 1, x e T, 



and 



|^-S||l,(t) < \\A-B\\l^^t) 



|Tr/' < , + (^r - \To\'/'^{\T\\To\-'y^'' < Ce, 



where we have used (5.18) and the value of the measure of Tq- D 

Remark 4 (Form of B). In the setting of Proposition [sj the matrix B takes one of three forms: (i) 
B = B, (ii) B = rl, (iii) B = B + al, for some a > 0. To convert this recipe into a numerical procedure we 
need first the approximation B of A. We construct B in §5.2.2| 

We can now prove the following theorem which shows that there is no essential loss of global accuracy 
by requiring that the approximation i? to ^ be uniformly positive definite. 

Theorem 5.1 (Enforcing positive definiteness globally). Let A be a positive definite matrix valued 
function in Lq{fl) whose eigenvalues are all in [r,M], for each x dzfl, with M >1. If B € E™ satisfies 

||A-B|U,(o) <e, 
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then there is an B G S™ whose eigenvalues are in [r/2, CM] for all x G Q and satisfies 

<Ce, (5.19) 

where C, C are the constants of Proposition 

Proof. Let B = X^TeT ^tXt where each of the matrices Bt have polynomial entries of degree < m and 
XT is the characteristic function of T. If Bt is the matrix from Proposition Q applied to A and Bt on T , 
then B{x) := X^Ter ^^(^)^^(^) eigenvalues in [r/2, CM] for each x £ ft. The error estimate (5.19) 

follows from \\A - Bt\\l,{t) < C\\A - Bt\\l,(t) for each T e T. □ 

If we define cr„(A)^_^(f2) in the same way as (t„, except that we require that the approximating matrices 



are positive definite, then Theorem 5.1 implies i5'Ti(^)L,(f2) < C''''n(^)Lg(n) ^oi all n> 1. 



5.2.2. Algorithms for approximating A. In view of Theorem |5.1[ we now concentrate on approx- 
imating A in Lg{il) without preserving positive definiteness. Let us first observe that approximating A by 
elements from T,^ is simply a matter of approximating its entries. For any d x d matrix B =^ (bij), we have 
that its spectral norm does not exceed ^ \bi,j\ and for any i,j it is at least as large as \bij\. Hence, 



max||6,j|U^(o) < \\B\\L^^n) < V < max . (5.20) 



It follows that approximating A in Lq{fl) by elements of E™ is equivalent to approximating its entries Oij 
in Lq{il,) by piecewise polynomials of degree < m. Moreover, note that A G M^{To, Lq{n)) is equivalent to 
each of the entries Oij being in B^{%, Lq{Q)). 

The analysis given above means that the construction of optimal algorithms for COEFF follow from the 



construction of optimal algorithms for functions in Lq as discussed in S 5.1 If we are able to compute the 
local error E{aij,T)L for each coefhcient Oij and each T, then we can construct a (near) class optimal 
algorithm COEFF for A4''{To, Lq(il)). Moreover, COEFF guarantees that the approximations are positive 
definite and satisfy (3.8); see Proposition [s] and Remark |4] 



6. Numerical Illustration. We present in this section an illustration of the algorithm DISC. We 
consider the L-shaped domain Q = [—5,5] x [—5,5] \ [0,5] x [0,5]. We use {p,S) to denote the polar 
coordinate from the origin (0,0). The diffusion tensor is taken to be A = a /, where / is the 2x2 identity 
matrix. 



a{p) = 



1 if P< Po, 
/i otherwise, 



and Po = 2a/2, p — 5. Define v{p,S) :— p'^^"^ sm{25/3) to be the standard solution on the L-shaped. The 
exact solution engineered to illustrate the performance of DISC is the standard singular solution for the 
L-shaped domain when p < po and a linear extension in the radial direction when p > pq. This is 

( S)- I. "'•^''^^ if P < Po, 

" I v{pq,S) + j^pI^"^ sm{2S/3){p — Po) otherwise. 

Notice that by construction / := — div(AVM) S L'^(il). 

In our numerical experiments we do not use newest vertex bisections. Rather, we use the quad- refinement 
strategy as studied in [7l Section 6] . The initial partition To of is made of quadrilaterals and refinements 
of To are performed using the quad-refinement strategy imposing at most one hanging node per edge, as 
implemented in deal . II [T]. We recall that our theory, although described only for newest vertex bissections, 
applied equally well for quadrilateral or hexahedral subdivisions with limited amount of hanging nodes; see 



[7J Section 6]. We also emphasize that the discontinuity of A is never matched by the partitions. Figure 6.1 
depicts the sequence of the partitions generated by the algorithm DISC implemented within deal. II. 

Given a partition T, the solution u is approximated by continuous piecewise bi-linear finite elements 
subordinate to T. Piecewise constant polynomials are used to approximate the diffusion matrix A and the 
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Fig. 6.1. Sequence of partitions (clockwise) generated by the algorithm with integrability index q — 2 for A, and 
parameters /? — 0.7, uj = 0.8, eo = 2. ] The initial partition (top left) is made of quadrilaterals with hanging nodes and 
all the partitions have at most one hanging node per side. The algorithm DISC refines at early stage only to capture 
the jump in the diffusion. The refinements caused by the singular behavior of the solution at the origin appear later 
in the adaptive process. 



right hand side / by their meanvalues A and /. We point out that the same spectral bounds r ^ 1,M = b 
are vahd for both A and A. 

The parameters are chosen to be /3 = 0.7, w = 0.8 and eo = 2. The standard AFEM loop in PDE is 
driven by error residual estimators together with a Dorfler (T4) marking strategy with parameter 9 = 0.3, 
which is rather conservative. 

We now discuss the choice of p for which Condition p is valid in view of Remark [l] Let f7 = f^i U 
where fti :— {{p,S) G fl : p < Pq/2} and :— ft\ili. The solution u e W^{Lp{il.i)) for any p < 6 in fii and 
u is Lipschitz in f22, whence Condition p is valid for p < 6 in fii, i.e any q — 2p/{p — 2) > 3, and p = oo 
in il2, i.e. any q >2. Since the diffusion coefficient is constant on fii, it leads to zero approximation error 
of A and we only have to handle the jump of A across the circular line {p = po} on fl2. 

Such a jump is never captured by the partitions, thereby making A never piecewise smooth over partitions 
of To and preventing the use of a standard AFEM. It is easy to check that for any 1 < q < oo, the matrix A 
is in Ai^^'^{To, Lq{il,)). Since the performance of DISC is reduced for larger q, according to (4.3), we should 



choose the smallest q — 2p/{p — 1) compatible with u e ^^^(-^^00(^^2)), namely q = 2 for p — 00. The right 
hand side / satisfies / G B^^^(To, L2{n)) C B^^^{To.. H-^(n)), whereas the solution u e A^/^{To, H^ifl)) 
because u e A^/'^{Tq, H^{Q,i)), i = 1,2, and Vm jumps over a Lipschitz hne [TT | fT2 ] . 

To test our theory, we take four different values of p and thus the corresponding q in our numerical 



experiments. For each of these different choices. Figure 6.2 (left) shows the decay of the energy error versus 



the number of degree of freedom in a log — log scale. The computational rates of convergence are 

-0.19 for (7 = 6, -0.23 for q^5, -0.35 for q = 3, -0.48 for q ^ 2, 

in agreement with the approximability of A stated above. These computational rates are close to the 
expected values —1/q, and reveal the importance of approximating and evaluating A within subdomains 
with the smallest Lebesgue exponent q possible. In this example q = 2 yields an optimal rate of convergence 



for piecewise bilinear elements. Figure 6.2 (right) depicts the Galerkin solution after 6 iterations of DISC for 
q = 2. We finally point out that DISC with q = 00, namely with A being approximated in Loo(ll), cannot 
reduce the pointwise error in A beyond 3.96 computationally which is consistent with the jump of A. As a 
consequence, any call of COEFF with any smaller target tolerance and q = 00 does not converge. 
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Fig. 6.2. (Left) Energy error versus number of degrees of freedom for values of q — 2,3,5.6. The optimal rate 
of convergence is recovered for q — 2. (Right) The Galerkin solution together with the underlying partition after 
6 iterations of the algorithm DISC with q = 2. The discontinuity of A is never captured by the partitions and the 
singularities of both A and Vu drive the refinements. 
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